BEST-COST GitHub repository

Let’s start by checking out the BEST-COST GitHub repo and the README file

Landing page of the BEST-COST GitHub repo. The README file tells you how to get started. The folder r_package contains package-related files (including function code). Developments are discussed and documented under Issues.

Landing page of the BEST-COST GitHub repo. The README file tells you how to get started. The folder r_package contains package-related files (including function code). Developments are discussed and documented under Issues.

README

The README file contains the following information

README file of the healthiar R package

About healthiar

The main function family of healthiar is the attribute family, used to attribute health impacts to a (environmental) risk factor, e.g. noise or air pollution. The core function attribute_health can be used for most assessments.

Figure: healthiar overview

healthiar in RStudio 1/3

RStudio startup screen

healthiar in RStudio 2/3

Post installation, you can access the healthiar package landing page in RStudio by going to the Packages tab and then clicking on the healthiar package.

healthiar in RStudio 3/3

note: first screenshot of full screen note: mention vignette opens as pdf or html

Landing page of the healthiar package in RStudio, where you find the package vignettes and function documentation.

Vignette

The vignette introduces healthiar step-by-step and contains (reproducible) examples. You can open the intro_to_healthiar vignette within RStudio or as a HTML within your browser.

Note

The vignette is a work in progress: we appreciate any feedback or suggestions you might have to make it more useful to future users!

Function documentation (1/) - Overview

Access the function documentation (= fun doc) by clicking on a function name in the package landing page.

Tip

When the package is loaded (via library(healthiar)) access the fun doc of e.g. attribute_health by running ?attribute_health.

Any fun doc contains the following sections: - Title Essence of the function - Description What does the function do exactly? - Usage Bare-minimum examples of how to use the function (includes default values if there are any) - Arguments Short description of each function argument: input type (numeric vs. string), options (if available), how each argument affects the output. - Details (optional) Additional details about the function - Value Information about the function output - Examples (optional) Shows how the function works

Title

TODO add screenshot + text

Description

TODO add screenshot + text

Usage

TODO add screenshot + text

Arguments

TODO add screenshot + text

Details

TODO add screenshot + text

Warning: not very developped right now. More details in the vignette.

note: show fun doc of attribute_healthiar

Value

TODO add screenshot + text

Workflow BoD

note: quickly introduce the main workflow of burden assessments

Function families

note: Quickly show function families (do after AC’s done merge)

Example: attribute call without input uncertainties

Goal: attribute COPD cases to air pollution

results_pm_copd <-
  healthiar::attribute_health(
    erf_shape = "log_linear", # Alternatives: "linear", "log_log"
    rr_central = 1.369, 
    rr_increment = 10,  # μg / m^3
    exp_central = 8.85, # μg / m^3
    cutoff_central = 5, # μg / m^3
    bhd_central = 30747 # baseline health data: COPD incidence
  ) 

Tip

healthiar comes with some example data that start with exdat_ that allow you to test functions.

results_pm_copd <-
  healthiar::attribute_health(
    erf_shape = "log_linear",
    rr_central = exdat_pm_copd$relative_risk, 
    rr_lower = exdat_pm_copd$relative_risk_lower,
    rr_upper = exdat_pm_copd$relative_risk_upper,
    rr_increment = 10, 
    exp_central = exdat_pm_copd$mean_concentration,
    cutoff_central = exdat_pm_copd$cut_off_value,
    bhd_central = exdat_pm_copd$incidents_per_100_000_per_year/1E5*exdat_pm_copd$population_at_risk,
    # bhd_central = exdat_pm_copd$incidence # Uncomment once change committed to main
  ) 

Output structure

Every attribute output consists of two lists (“folders”)

  • health_main contains the main results

  • health_detailed detailed results (and in some cases even more information about the assessment/calculation)

note: add here that format is tibble (but can be changed to data frame)

note: screenshot of how to click on variable in env (put that as first option) . . .

Different ways to access the results

Tip

This is personal preference! However, you might encounter all options.

Go to the Environment tab in RStudio and click on a variable to “open” it. Alternatively, you can use View(results_pm_copd), which has the same effect.

results_pm_copd$health_main$impact_rounded

Note: after typing the $ sign you can see all available options by pressing the tab key and use the arrows & tab keys to select an option (or alternatively use the mouse)

results_pm_copd[["health_main"]]

Note: if the cursor is located within the square braces you can see all available options by pressing the tab key

Using the purrr::pluck function to select a list and then the dplyr::pull function extract values from a specified column

results_pm_copd |> purrr::pluck("health_main") |> dplyr::pull("impact_rounded")

Note: available options can’t be displayed automatically using these functions -> better suited for a more permanent analysis script

Let’s check the main output!

results_pm_copd[["health_main"]]
impact_rounded impact pop_fraction erf_ci rr exp bhd
3502 3501.962 0.1138961 central 1.369 8.85 30747
1353 1353.066 0.0440064 lower 1.124 8.85 30747
5474 5473.888 0.1780300 upper 1.664 8.85 30747

Tip

Each row shows a result obtained with all the input data & calculation pathway specifications shown in that row

Some of the most relevant columns include: - impact_rounded Rounded attributable health impact/burden - impact Raw impact/burden - pop_fraction Population attributable fraction (PAF) - erf_ci Specifies whether rr_central, ..._lower or ..._upper was used in calculation - rr Specifies raw rr used in calculation - - exp - bhd*

Example: attribute with input uncertainties

Goal: attribute lung cancer deaths to PM2.5 exposure

results_pm_copd <-
  healthiar::attribute_health(
    erf_shape = "log_linear",
    rr_central = 1.369, 
    rr_lower = 1.124, 
    rr_upper = 1.664,
    rr_increment = 10, 
    exp_central = 8.85, 
    exp_lower = 8, 
    exp_upper = 10,
    cutoff_central = 5,
    bhd_central = 30747, 
    bhd_lower = 28000, 
    bhd_upper = 32000
  ) 

Let’s check the detailed output!

Tip

See the intro vignette for a detailed description of output columns.

The health_detailed output table contains all different combinations of the arguments with uncertainty. E.g. rr_central with exp_lower and bhd_upper, …

results_pm_copd[["health_detailed"]][["raw"]]
geo_id_disaggregated erf_ci exp_ci bhd_ci cutoff_ci pop_fraction impact prop_pop_exp rr_increment erf_shape exposure_name approach_risk health_outcome exposure_dimension exposure_type exp rr bhd cutoff pop_fraction_type rr_conc impact_rounded
1 central central central central 0.1138961 3501.9619 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.369 30747 5 paf 1.128536 3502
1 central central lower central 0.1138961 3189.0894 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.369 28000 5 paf 1.128536 3189
1 central central upper central 0.1138961 3644.6736 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.369 32000 5 paf 1.128536 3645
1 lower central central central 0.0440064 1353.0658 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.124 30747 5 paf 1.046032 1353
1 lower central lower central 0.0440064 1232.1801 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.124 28000 5 paf 1.046032 1232
1 lower central upper central 0.0440064 1408.2058 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.124 32000 5 paf 1.046032 1408
1 upper central central central 0.1780300 5473.8882 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.664 30747 5 paf 1.216589 5474
1 upper central lower central 0.1780300 4984.8398 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.664 28000 5 paf 1.216589 4985
1 upper central upper central 0.1780300 5696.9598 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.85 1.664 32000 5 paf 1.216589 5697
1 central lower central central 0.0899213 2764.8092 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.369 30747 5 paf 1.098806 2765
1 central lower lower central 0.0899213 2517.7955 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.369 28000 5 paf 1.098806 2518
1 central lower upper central 0.0899213 2877.4806 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.369 32000 5 paf 1.098806 2877
1 lower lower central central 0.0344604 1059.5528 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.124 30747 5 paf 1.035690 1060
1 lower lower lower central 0.0344604 964.8902 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.124 28000 5 paf 1.035690 965
1 lower lower upper central 0.0344604 1102.7316 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.124 32000 5 paf 1.035690 1103
1 upper lower central central 0.1416706 4355.9450 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.664 30747 5 paf 1.165054 4356
1 upper lower lower central 0.1416706 3966.7760 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.664 28000 5 paf 1.165054 3967
1 upper lower upper central 0.1416706 4533.4583 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 8.00 1.664 32000 5 paf 1.165054 4533
1 central upper central central 0.1453304 4468.4726 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.369 30747 5 paf 1.170043 4468
1 central upper lower central 0.1453304 4069.2501 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.369 28000 5 paf 1.170043 4069
1 central upper upper central 0.1453304 4650.5716 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.369 32000 5 paf 1.170043 4651
1 lower upper central central 0.0567717 1745.5580 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.124 30747 5 paf 1.060189 1746
1 lower upper lower central 0.0567717 1589.6063 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.124 28000 5 paf 1.060189 1590
1 lower upper upper central 0.0567717 1816.6929 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.124 32000 5 paf 1.060189 1817
1 upper upper central central 0.2247829 6911.4001 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.664 30747 5 paf 1.289961 6911
1 upper upper lower central 0.2247829 6293.9214 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.664 28000 5 paf 1.289961 6294
1 upper upper upper central 0.2247829 7193.0531 1 10 log_linear NA relative_risk same_input_output 1 population_weighted_mean 10.00 1.664 32000 5 paf 1.289961 7193

Example: attribute with exposure categories (noise)

Goal: attribute cases of high annoyance (HA) to noise exposure

exdat_noise_ha <-
  exdat_noise_ha |>
  dplyr::filter(!is.na(exdat_noise_ha$exposure_mean))

results_noise_ha <- 
  healthiar::attribute_health(
    approach_risk = "absolute_risk",
    exp_central = c(57.5, 62.5, 67.5, 72.5, 77.5),
    population = sum(exdat_noise_ha$population_exposed_total), # TO DO: hard code input here
    prop_pop_exp = exdat_noise_ha$population_exposed_total / 
      sum(exdat_noise_ha$population_exposed_total),
    erf_eq_central = "78.9270-3.1162*c+0.0342*c^2")

Source input data: NIPH
results_noise_ha[["health_detailed"]][["raw"]] |> knitr::kable()
geo_id_disaggregated erf_ci exp_ci population prop_pop_exp exposure_name approach_risk health_outcome exposure_dimension exposure_type exp erf_eq absolute_risk_as_percent pop_exp impact impact_rounded impact_per_100k_inhab
1 central central 945200 0.4099661 NA absolute_risk same_input_output 1 exposure_distribution 57.5 78.9270-3.1162c+0.0342c^2 12.81925 387500 49674.594 49675 5255.4585
1 central central 945200 0.3025815 NA absolute_risk same_input_output 2 exposure_distribution 62.5 78.9270-3.1162c+0.0342c^2 17.75825 286000 50788.595 50789 5373.3173
1 central central 945200 0.2029200 NA absolute_risk same_input_output 3 exposure_distribution 67.5 78.9270-3.1162c+0.0342c^2 24.40725 191800 46813.105 46813 4952.7196
1 central central 945200 0.0763860 NA absolute_risk same_input_output 4 exposure_distribution 72.5 78.9270-3.1162c+0.0342c^2 32.76625 72200 23657.232 23657 2502.8811
1 central central 945200 0.0081464 NA absolute_risk same_input_output 5 exposure_distribution 77.5 78.9270-3.1162c+0.0342c^2 42.83525 7700 3298.314 3298 348.9541

Iteration example

Compare example

note: definitely include because WP5 will do

Additional analyses

Monetization

Social analysis

note: if time allows

Other features

Check out the intro vignette for examples (to be added)

  • correlated exposures

  • life table analysis

  • get_daly

Exporting results

note: show how to export csv, …

Visualize

Out of scope of healthiar

  • Using R, e.g. with ggplot2 package

  • Using Excel (once results are exported)

Introduce homework

note: stress that they must install & test healthiar before 2nd WS

note: AC: not more than 2-3 exercises, one with provided numbers, one where they use exdat

note: put exercises in PDF on Teams

If you encounter challenges during installation, get in touch with us!

Outlook WS2

note: mention programme WS2 & explicitly YOU WILL HAVE TO PROGRAMME IN RSTUDIO USING HEALTHIAR

note: document any suggestions for improvements